Method for PMU data recovery using an improved cubic spline interpolation and singular value decomposition

ABSTRACT

A data recovery method framework is described, in which the data is classified as either ambient or disturbance data, and recovered by different methods to achieve good performance efficiently. An approach based on decision tree is described for identifying ambient and disturbance data. Then, an improved cubic spline interpolation based on the priority allocation strategy is described for ambient data loss, which can quickly and accurately recover ambient data. Simultaneously, a disturbance data recovery method based on singular value decomposition is described. It can achieve disturbance data recovery accurately by a single channel of measurement.

BACKGROUND

A high penetration of renewable energies into the modern grid createsrandomness and uncertainties which require advanced real-time monitoringand control. Phasor measurement units (PMUs) and wide-area measurementsystems (WAMS) show great promise for operation monitoring and stabilityenhancement due to characteristics of synchronization, rapidity, andaccuracy. However, different levels of data loss issues can occur inpractical applications as a result of varying conditions, includingcommunication congestion, hardware failure, and transmission delay. Dataloss issues can severely restrict such monitoring applications in powersystems, and may even threaten the security of the grid.

SUMMARY

To address this problem, an adaptive PMU missing data recovery method isdescribed in this disclosure. A data recovery method framework isdescribed, in which the data is classified as either ambient ordisturbance data, and recovered by different methods to achieve goodperformance efficiently. An approach based on decision tree is describedfor identifying ambient and disturbance data. Then, an improved cubicspline interpolation based on the priority allocation strategy isdescribed for ambient data loss, which can quickly and accuratelyrecover ambient data. Simultaneously, a disturbance data recovery methodbased on singular value decomposition is described. It can achievedisturbance data recovery accurately by a single channel of measurement.Finally, the feasibility and accuracy of the described methods areverified through simulation and hardware-based test platform fed byfield recorded data. The simulation and testing results show that thismethod can achieve data identification and recovery efficiently solelybased on data, and that applying the described method to all aspects ofthe power system can provide superior PMU measurement guarantees.

BRIEF DESCRIPTION OF THE DRAWINGS

In order to facilitate a fuller understanding of the present invention,reference is now made to the attached drawings. The drawings should notbe construed as limiting the present invention, but are intended only toillustrate different aspects and embodiments of the invention.

FIG. 1 shows an example framework diagram of a data recovery algorithm.

FIG. 2 shows an example diagram of a data group.

FIG. 3 shows exemplary nodes of a decision tree according to an exampleembodiment.

FIG. 4A shows exemplary results of optimal parameters.

FIG. 4B shows exemplary results of optimal parameters.

FIG. 5 shows an exemplary diagram displaying an odd number of datapoints missing.

FIG. 6 shows an exemplary diagram displaying an even number of datapoints missing.

FIG. 7 shows an example schematic diagram of signal component phasorduring amplitude modulation.

FIG. 8 shows an example schematic diagram of PMU measurement data.

FIG. 9 shows an example schematic diagram of Gershgorin disk method.

FIG. 10 shows exemplary relationships between loss ratio, matrix rownumber, and recovery accuracy.

FIG. 11 shows a flow chart of PMU data identification and recoveryalgorithm according to an example embodiment.

FIG. 12 shows example identification result of ambient and disturbancedata.

FIG. 13 shows example identification result of ambient and disturbancedata when some data is lost.

FIG. 14 shows example results of three methods for different data losstypes.

FIG. 15 shows a test platform according to an example embodiment.

FIGS. 16 and 17 show example PMU measured inter-harmonic data.

FIG. 18 shows example identification result of ambient and disturbancedata.

FIG. 19 shows example identification results of field data fromdifferent PMUs.

FIG. 20 shows identification result of ambient and disturbance data whensome data is lost.

FIG. 21 shows example comparison of single data recovery results.

FIG. 22 shows example comparison of contiguous data and non-contiguousdata recovery results.

FIG. 23 shows example comparison of contiguous data and non-contiguousdata recovery results.

FIG. 24 illustrates exemplary hardware components for a computer systemincluding a controller.

DETAILED DESCRIPTION

Exemplary embodiments of the invention will now be described in order toillustrate various features of the invention. The embodiments describedherein are not intended to be limiting as to the scope of the invention,but rather are intended to provide examples of the components, use, andoperation of the invention.

1. Introduction

The largest and most complex interconnected power system in the worldhas been built in China, which holds the highest transmission voltagelevel and largest capacity of renewable energy globally. However, theoperation modes and dynamic characteristics of the system have becomeincreasingly complicated due to the growing scale of the grid,penetration level of renewables, and integration of distributedgenerations. Real-time monitoring and precise close-loop control arerequired to ensure the stability of the grid, and can be carried out byphasor measurement units (PMUs) owing to its synchronization, rapidity,and accuracy. Potential applications based on PMU data, including stateestimation, parameter identification, and wide-area protection andcontrol, have also drawn much attention recently. At present, about 3000PMUs have been installed and put into operation in China, covering themajority of 220 kV and above substations, main power plants, andgrid-connected renewable energy collections. In addition, it is reportedthat around 2500 commercial PMUs have been installed in North America.

Due to communication congestion, hardware failures, transmission delays,and other issues, on-site PMUs typically experience different degrees ofdata quality issues. According to the 2011 Five-Year Plan issued by theCalifornia Independent System Operator, around 10 to 17% of PMU data inNorth America corrupted, and this ratio is as high as 20 to 30% inChina. The quality of PMU data severely restricts its variousapplications in the power system, in which data loss is a key factoraffecting data quality. Loss of PMU data can lead to reduced systemobservability, affecting state estimation and parameter identification,and even threatening grid security. Therefore, restoring PMU missingdata has become increasingly important.

Some use PMU data recovery methods such as interpolation, low-rankmatrix method, sparse matrix method, and state estimation method. Basedon the statistics of PMU data loss in North America, a Lagrangeinterpolation method for effectively recovering lost data has beenconsidered. The time-series prediction model can be combined with Kalmanfiltering and smoothing algorithm, and a PMU data adjustment method canbe presented to effectively improve data accuracy. The methods based onlinear interpolation are simple and easy to calculate, but when there isthe disturbance data, the recovery accuracy is low. To improve dataquality, an online algorithm according to the low-rank characteristicsof PMU data can be used. By exploiting PMU measurement to construct aHankel matrix, this method can effectively recover missing data. Inaddition, some have considered data recovery from the perspective ofdata sparsity, and the state space model and constructor function can beused to predict the frequency changes of power systems. Although thesemethods can recover ambient data, there are still few adequate recoverymethods for disturbance data. At the same time, 95% of data loss eventsonly involve up to three contiguous data points, and a large number ofcontiguous data loss or non-contiguous data loss occurs much lessfrequently. Therefore, it is desirable to develop an algorithm that caneffectively recover the contiguous loss of ambient and disturbance data.

In this disclosure, an adaptive PMU missing data recovery method isdescribed. Aiming at the different features of ambient and disturbancedata, a data identification method based on a decision tree ispresented. It can effectively identify data even when large amounts ofdata are lost. On this basis, for the ambient data, a recovery methodbased on improved cubic interpolation with a priority allocationstrategy is described, which can recover the data precisely with lowcalculation burden. For the disturbance data, a recover method based onsingular value decomposition is described, which can recover disturbancedata only relying on a single channel PMU measurement even when the dataloss ratio is high. Finally, the described method is tested bysimulation and field recorded data. Results show that this method canselect an appropriate method for recovery according to the data state,which increases the calculation precision and the calculation efficiencycompared with the existing methods. The findings demonstrate that PMUmeasurement is better applied to all aspects of the power system byusing this method.

2. Data Recovery Method Framework

When the power system is in steady state for a long time, itsmeasurement data is ambient and easy to recover. Under such conditions,the interpolation algorithm can effectively recover the missing data,and the calculation is simple and efficient. However, when there isdisturbance in the system, the measurement changes nonlinearly, and isdifficult to recover the missing data by interpolation which requires amore complicated algorithm. This disclosure describes an adaptive PMUmissing data recovery algorithm which can achieve accurate recovery ofdifferent kinds of missing data.

FIG. 1 shows an example framework diagram of a data recovery algorithm.In this example embodiment, in step 110, the PMU measurement (data) isobtained. In step 120, the data is divided into ambient data anddisturbance data. In step 131, the ambient data is restored by arecovery algorithm based on interpolation. In step 132, the disturbancedata is restored by singular value decomposition.

At the same time, in order to simplify the model, the missing data offield PMUs is analyzed and the three types of data loss are summarizedas single data loss type, contiguous data loss type, and non-contiguousdata loss type. Note that the non-contiguous data loss type can beequivalent to the combination of the above two types.

3. Ambient and Disturbance Data Identification Method

3.1. Construction of Decision Tree

The identification of ambient and disturbance data can be consideredequivalent to a binary classification problem. The C4.5 decision tree, atool in machine learning, can be adopted here to solve this problem,because it uses the information gain ratio to select features instead ofthe information gain in the ID3 algorithm to avoid the preference forfeatures with more values.

FIG. 2 shows an example diagram of a data group. Assuming the field PMUdata set from a single channel PMU measurement has a sequence of N=k·ndata points, D=X₁, X₂, . . . , X_(N) over time window t. As shown inFIG. 2, the dataset is divided into n groups (C₁, C₂, . . . , C_(n)).Each data X_(i)∈D is associated with its measurement and system stateflag.

FIG. 2 indicates that there are n groups of data (C₁, C₂, . . . ,C_(n)), and N=k·n. All the data is labelled as ambient and disturbancedata according to whether there is an oscillation or just relativeconstant data with noise. The label of ambient data is 0 and the labelof disturbance data is 1. The state label s_(i) of each group isdetermined by its internal data, for example s₁=0, s₃=1. If there aretwo kinds of data in the group, the state of the group is determined bythe kind of the majority, for example s₂=0, s_(n)=1. In addition, thevariance a and extreme value difference b_(i) are calculated as thefeatures in each group. The variance a indicates the deviation ofamplitudes in a group. The extreme value difference b_(i) indicates thedifference between the maximum and minimum amplitudes. Then, 80% ofgroups are randomly selected as the training set T, and the remaining20% are the test set T′. The training data is used to form a decisiontree. The test data is used to verify the accuracy of the decision tree.The steps of C4.5 decision tree is as followed.

The total information entropy of the training data is calculated by Eq.(1):

$\begin{matrix}{{H(T)} = {- {\sum\limits_{i = 1}^{2}{p_{i}\log_{2}p_{i}}}}} & (1)\end{matrix}$where p₁ indicates the proportion of ambient data group in T, p₂indicates the proportion of disturbance data group in T, and H(T) is theuncertainty of the group state. This information plays a role inselecting the optimal partitioning feature in the decision tree.

It is assumed that feature a is selected to divide T, and the contiguousfeature a is discretized by dichotomy. There are m different values infeature a, and these values are sorted from small to large, {a¹, a², . .. , a^(m)}. Using the median point

$\frac{a^{i} + a^{i + 1}}{2}$of the interval [a^(i)+a^(i+1)) is as the partition point, a set of m−1partition point can be obtained.

$\begin{matrix}{Q = \left\{ {q_{i} = \left. \frac{a^{i} + a^{i + 1}}{2} \middle| {1 \leq i \leq {m - 1}} \right.} \right\}} & (2)\end{matrix}$

For each diving point q_(i), T is divided into subsets T_(q) ⁻ and T_(q)⁺. T_(q) ⁻ contains the groups where a_(i)≤q_(i), and T_(q) ⁺ containsthe groups where a_(i)>q_(i). The information gain of q_(i) is computedaccording to (3):

$\begin{matrix}{{G\left( {T,a,q_{i}} \right)} = {{H(T)} - {\frac{T_{q_{i}}^{-}}{T}{H\left( T_{q_{i}}^{-} \right)}} - {\frac{T_{q_{i}}^{+}}{T}{H\left( T_{q_{i}}^{+} \right)}}}} & (3)\end{matrix}$where |T| represents the number of groups, |T_(q) _(i) ⁻|/|T| is theweight of the groups in which a_(i)≤q_(i), and |T_(q) _(i) ⁺|/|T|represents the weight of the groups where a_(i)>q_(i). The greater theinformation gain is, the better the effect of partition point.

The maximum value is selected from the information gain of all thepartition points as the information gain of the feature a. Because theinformation gain criterion has a preference on features with morevalues, the gain ratio is used to select the optimal feature. It isdefined as follows:

$\begin{matrix}{{g\left( {T,a,q_{i}} \right)} = \frac{G\left( {T,a,q_{i}} \right)}{I{V(a)}}} & (4) \\{{I\;{V(a)}} = {- {\sum\limits_{\beta \in {\{{- {, +}}\}}}{\frac{T_{q_{i}}^{\beta}}{T}\log_{2}\frac{T_{q_{i}}^{\beta}}{T}}}}} & (5)\end{matrix}$where IV(a) is the intrinsic value.

Therefore, the partition point q_(a) with the largest gain ratio g(T, a,q_(a)) is selected as the branch node of the decision tree.

FIG. 3 shows exemplary nodes of a decision tree according to an exampleembodiment. The decision tree depth d and the threshold ε of theinformation gain ratio are set first, which determines the accuracy ofthe classification. d indicates the number of recursive calculations.And the end condition of the decision tree is as follows: if the maximumnumber of calculations exceeds d, the partition is stopped; if theinformation gain ratio of each feature is less than ε, it is notdivided. Meanwhile, if all the states in the division result areconsistent, the division is stopped.

In step 310, all the groups C_(i) are provided as inputs. In step 320,the gain ratios of the features (a, b) are calculated, and the largestone is selected for comparison with the threshold ε. If the gain ratiois less than ε, the tree is a single node tree (325). It is consideredthat the state of all the groups is the state of the majority. If not,it is assumed that g(T, a, q_(a)) is the largest one. In step 330, a_(i)is compared with q_(a), and in step 340, all groups C_(i) witha_(i)≤q_(a) are placed in one class, and the remaining are placed inanother class. This node is called the branch node. The above steps arethen repeated recursively (350) until all groups in one class are in thesame state, or the maximum gain ratio is less than ε, as illustrated bythe dotted box in FIG. 3. The last node is called leaf node (360) andthis completes the decision tree process.

A decision function is used to illustrate that the test data T′ containswhich kind of data. The test data is equally divided into groups C′_(i),and each group is input into the decision tree to determine itscorresponding state s_(i). The decision function is as follows.

$\begin{matrix}{{f\left( T^{\prime} \right)} = \left\{ \begin{matrix}{0,} & {{s_{i} = 0},{i = 1},2,\ldots\mspace{14mu},\left\lbrack {{T^{\prime}}/k} \right\rbrack} \\{1,} & {{s_{i} = 1},{i = 1},2,\ldots\mspace{14mu},\left\lbrack {{T^{\prime}}/k} \right\rbrack} \\{2,} & {{s_{i} \neq s_{j}},{\forall{i \neq j}},i,{j = 1},2,\ldots\mspace{14mu},\left\lbrack {{T^{\prime}}/k} \right\rbrack}\end{matrix} \right.} & (6)\end{matrix}$

Eq. (6) shows that if all s_(i)=0, the function result is 0. It meansthe test data is all ambient data, and the missing data can be recoveredby the ambient data recovery method. If all s_(i)=1, the function resultis 1. It means the test data is all disturbance data, and the missingdata can be recovered by the disturbance data recovery method. Ifs_(i)≠s_(j), the function result is 2. It means both ambient anddisturbance data are included in the test set, and the missing data canbe recovered by the corresponding method.

When data loss occurs in the PMU measurement, the identificationalgorithm only uses the existing data to do the identification. Theimpact of the missing data on the identification results are shown in6.1.1 and 6.2.2, respectively.

3.2. Algorithm Parameter Setting

At the beginning of the algorithm, a threshold ε of the information gainratio and the depth of the decision tree d must be set. FIG. 4A showsexemplary results of optimal parameters, in particular, shows therelationship between E and accuracy and FIG. 4B shows exemplary resultsof optimal parameters, in particular, shows the relationship between dand accuracy. The optimal parameters can be selected by traversing, asshown in FIG. 4A and FIG. 4B.

It can be seen from FIG. 4A that when s gradually increases to 0.006,the accuracy of the test data reaches the maximum and remains constant.Therefore, the threshold must be set to 0.006. FIG. 4B illustrates thatwhen the depth of the decision tree is 5, the accuracy of the test datais the highest. When the depth is greater than 5, the structure of thedecision tree is too complicated and leads to over-fitting, which inturn lowers the test data accuracy.

4. Recovery Method for Ambient Data

4.1. Priority Allocation Strategy

When a single data point is lost, there is no need to consider therecovery order. If there is a large amount of missing data, the existingmethod requires a large amount of calculation, and the error is large.To restore the original information as much as possible, this disclosuredescribes a priority allocation strategy based on the dynamic change ofinterpolation interval, which can effectively improve the recoveryaccuracy.

The contiguous data loss type is taken as an example here. According tothe parity of missing data, the strategy should be discussed in twocases, e.g., an odd number of data points missing and an even number ofdata points missing. FIG. 5 shows a diagram displaying an odd number ofdata points missing. FIG. 6 shows a diagram displaying an even number ofdata points missing. The square represents the PMU measurement over aperiod of time. The square (505 a, 505 b, 505 c, 505 d and 505 e inFIGS. 5 and 605 a, 605 b, 605 c, 605 d, 605 e and 605 f in FIG. 6) ismissing data, and the square (510 a, 510 b, 510 c, 510 d, 510 e and 510f in FIGS. 5 and 610 a, 610 b, 610 c and 610 d in FIG. 6) is known data.

All PMU data is taken as input. The number of missing data points is setto E, and the interpolation interval is set to Z. When E is odd,

$Z_{1} = {\frac{E + 1}{2}.}$X_(n) is to be recovered in the first stage using eight data points(X_(n−4Z) ₁ , X_(n−3Z) ₁ , X_(n−2Z) ₁ , X_(n−Z) ₁ , X_(n+Z) ₁ , X_(n+2Z)₁ , X_(n+3Z) ₁ , X_(n+4Z) ₁ ) with the interval Z₁, where

${n = \left\lceil \frac{E}{2} \right\rceil}.$Next, X_(n−2), and X_(n+2) are recovered in the second stage using theexisting data between which the interval is Z₂, and the data recoveredin the first stage (X_(n)), where Z₂=Z₁−1. Finally, X_(n−1) and X_(n+1)are recovered in the third stage, using the existing data between whichthe interval is Z₃, and the data recovered in the first two stages(X_(n−2), X_(n), X_(n+2)), where Z₃=Z₂−1.

When E is even,

$Z_{1} = {\frac{E + 2}{2}.}$Both X_(n) and X_(n−1) are recovered in the first stage using eight datapoints (X_(n−4Z) ₁ , X_(n−3Z) ₁ , X_(n−2Z) ₁ , X_(n−Z) ₁ , X_(n+Z) ₁ ,X_(n+2Z) ₁ , X_(n+3Z) ₁ , X_(n+4Z) ₁ and X_(n−1−4Z) ₁ , X_(n−1−3Z) ₁ ,X_(n−1−2Z) ₁ , X_(n−1−Z) ₁ , X_(n−1+Z) ₁ , X_(n−1+2Z) ₁ , X_(n−1+3Z) ₁ )with the interval Z₁, respectively. X_(n−3) and X_(n+2) are recovered inthe second stage using the existing data between which the interval isZ₂, and the data recovered in the first stage (X_(n−1), X_(n)), whereZ₂=Z₁−1. In the third stage, X_(n−2) and X_(n+1) are recovered using theexisting data between which the interval is Z₃, and the data recoveredin the first two stages (X_(n−3), X_(n−1), X_(n), X_(n+2)), whereZ₃=Z₂−1.

4.2. Data Recovery Based on Improved Cubic Spline Interpolation

Based on the priority allocation strategy, this disclosure uses thecubic spline interpolation method to recover missing data. The cubicspline interpolation can ensure the first and second derivatives of thefunction are contiguous. Meanwhile, the curve obtained by the cubicspline interpolation has good smoothness and strong linear approximationability which can effectively represent the data change and avoid theRunge phenomenon caused by the high-order polynomial. However, otherinterpolation functions like Lagrange, Newton or Piecewise interpolationdo not have such a property. In addition, the effect of theinterpolation depends not only on the endpoint data of the missinginterval, but also on data points whose interval is equidistant from themissing data. Based on the priority allocation strategy, the periodicsignal in the time series is not impaired. This is because the intervalbetween the known points and the missing points is equal.

The cubic spline interpolation function is obtained from known data, andthe missing data can be recovered because the function has a good fitwith the known data. The function construction method is as follows.

A function X=ƒ(t) of voltage amplitude X_(i) and time t_(i) isconstructed, where t_(i)∈[t_(a), t_(b)] and i=0, 1, . . . , n.

If S(t) satisfies: 1) On each subinterval [t_(k), t_(k+1)] (k=1, 2, . .. , n−1), S(t) is a polynomial no more than three times, andS(t_(i))=X_(i). 2) S(t), S′(t), and S″(t) are contiguous on interval[t_(a), t_(b)]. Therefore, S(t) is the cubic spline interpolationfunction of ƒ(t) on node t₀, t₁, . . . , x_(n).

Set M_(i)=S″(t_(i)), and the first derivative at the boundary of eachinterval of the cubic spline interpolation function is smoothed whilethe second derivative is contiguous. Therefore, in interval [t_(i),t_(i+1)], the second derivative of S(t) is expressed as:

$\begin{matrix}{{S^{''}(t)} = {{M_{i}\frac{t_{i + 1} - t}{h_{i}}} + {M_{i + 1}\frac{t - t_{i}}{h_{i}}}}} & (7)\end{matrix}$where h_(i)=t_(i+1)−t_(i). Two consecutive integrations are thenperformed on formula (7), obtaining:

$\begin{matrix}{{S^{\prime}(t)} = {{{- M_{i}}\frac{\left( {t_{i + 1} - t} \right)^{2}}{2h_{i}}} + {M_{i + 1}\frac{\left( {t - t_{i}} \right)^{2}}{2h_{i}}} + \frac{X_{i + 1} - X_{i}}{h_{i}} - {\frac{M_{i + 1} - M_{i}}{6}h_{i}}}} & (8) \\{{S(t)} = {{M_{i}\frac{\left( {t_{i + 1} - t} \right)^{3}}{6h_{i}}} + {M_{i + 1}\frac{\left( {t - t_{i}} \right)^{3}}{6h_{i}}} + {\left( {\frac{X_{i}}{h_{i}} - {\frac{M_{i}}{6}h_{i}}}\  \right)\left( {t_{i + 1} - t} \right)} + {\left( {\frac{X_{i + 1}}{h_{i}} - {\frac{M_{i + 1}}{6}h_{i}}}\  \right)\left( {t - t_{i}} \right)}}} & (9)\end{matrix}$

Using continuity S′(t_(i−))=S′(t_(i+)), Eq. (10) is as follows:μ_(i) M _(i−1)+2M _(i)+λ_(i) M _(i) =d _(i) (i=1,2, . . . ,n−1)  (10)where,

$\begin{matrix}\left\{ \begin{matrix}{{\beta_{i} = \frac{h_{i - 1}}{h_{i} + h_{i - 1}}},{\lambda_{i} = {1 - \beta_{i}}}} \\{d_{i} = {6\left( {\frac{X_{i + 1} - X_{i}}{h_{i}} - \frac{X_{i} - X_{i - 1}}{h_{i - 1}}} \right)\left( {h_{1} + h_{i - 1}} \right)^{- 1}}}\end{matrix} \right. & (11)\end{matrix}$

According to standard boundary conditions S′(t₀)=X′₀, S′(t_(n))=X′_(n),Eq. (10) can be written in the following matrix form:

$\begin{matrix}{{\begin{bmatrix}2 & 1 & \; & \; & \; & \; \\\beta_{1} & 2 & \lambda_{1} & \; & \; & \; \\\; & \beta_{2} & 2 & \lambda_{2} & \; & \; \\\; & \; & \ddots & \ddots & \ddots & \; \\\; & \; & \; & \beta_{n - 1} & 2 & \lambda_{n - 1} \\\; & \; & \; & \; & 1 & 2\end{bmatrix}\begin{bmatrix}M_{0} \\M_{1} \\M_{2} \\\vdots \\M_{n - 1} \\M_{n}\end{bmatrix}} = \begin{bmatrix}d_{0} \\d_{1} \\d_{2} \\\vdots \\d_{n - 1} \\d_{n}\end{bmatrix}} & (12)\end{matrix}$

The coefficient matrix in Eq. (12) is strictly diagonally dominant andhas a unique solution.

The standard cubic spline interpolation function is a function of theboundary condition. According to the ambient data characteristics, inorder to make the spline curve flatter, the minimum difference isrequired to be minimum.

In the interval [t_(i−1), t_(i)] (i=1, 2, . . . , n), the extreme valuesof S(t) are S_(i) max and S_(i) min. Thus, the extreme difference ofS(t) in the interval [t_(a), t_(b)] is:

$\begin{matrix}{Q = {{\max\limits_{1 \leq i \leq n}\left( {S_{i}\max} \right)} - {\min\limits_{1 \leq i \leq n}\left( {S_{i}\min} \right)}}} & (13)\end{matrix}$

Assuming S′(t₀)=X′₀, S′(t_(n))=X′_(n) is unknown, the extreme differenceis a function of M₀ and M_(n).

$\begin{matrix}{{\min\; Q} = {{\max\limits_{1 \leq i \leq n}\left( {S_{i}\max} \right)} - {\min\limits_{1 \leq i \leq n}\left( {S_{i}\min} \right)}}} & (14)\end{matrix}$

Function (14) is an unconstrained nonlinear programming problem for M₀and M_(n). The objective function contains extremum operations, so itcan be calculated by direct method.

In summary, the ambient data recovery method steps are as follows: 1)Data loss is analyzed and the type of data loss is determined. 2) Thedata recovery is determined according to the priority allocationstrategy. 3) An improved cubic spline interpolation function is foundthat satisfies the extreme value difference constraint and the missingdata is recovered. The number of optimal interpolation points can beprovided in these techniques.

5. Recovery Method for Disturbance Data

In addition to the ambient data, most of the other data in the powersystem is nonlinear and disturbed. It is difficult for the interpolationalgorithm to recover disturbance data. Therefore, this disclosure adoptsthe method based on singular value decomposition to extract usefulinformation from short-term, noise measurement, then recovers themissing data.

5.1. Characteristic Component Decomposition of Phasor

In this disclosure, the PMU measurement phasor is decomposed, andcharacterized as including characteristic components of differentcomponents. Taking the measurement information under system oscillationas an example, it can be characterized by an amplitude modulation signalcontaining a plurality of modulation components. The signal expressionis as follows:

$\begin{matrix}{{x(t)} = {{\sqrt{2}{\sum\limits_{i = 1}^{\frac{k - 1}{2}}{2X_{i}{\cos\left( {{2\pi\; f_{i}t}\; + \varphi_{i}} \right)}{\cos\left( {{2\pi\; f_{0}t} + \varphi_{0}} \right)}}}} + {\sqrt{2}X_{0}{\cos\left( {{2\pi f_{0}t} + \varphi_{0}} \right)}} + {n(t)}}} & (15)\end{matrix}$where X₀ is fundamental phasor magnitude, ƒ₀ is power frequency, φ₀ isthe initial phase angle, 2X_(i) is amplitude modulation depth of eachmodulation component, ƒ_(i) is the frequency of response modulation,φ_(i) is the phase angle of response modulation, and n(t) is the noise.

By trigonometric function transformation, formula (15) can be written asfollows:

$\begin{matrix}\begin{matrix}{{x(t)} = {{{\sqrt{2}\left\lbrack {X_{0} + {\sum\limits_{i = 1}^{\frac{k - 1}{2}}{2X_{i}{\cos\left( {{2\pi\; f_{i}t} + \varphi_{i}}\  \right)}}}} \right\rbrack}{\cos\left( {{2\pi\; f_{0}t} + \varphi_{0}} \right)}} + {n(t)}}} \\{= {{\sqrt{2}X_{0}{\cos\left( {{2\pi f_{0}t} + \varphi_{0}} \right)}} + {\sqrt{2}{\sum\limits_{i = 1}^{\frac{k - 1}{2}}{X_{i}{\cos\left\lbrack {{2{\pi\left( {f_{0} + f_{i}} \right)}t} +} \right.}}}}}} \\{\left. \left( {\varphi_{0} + \varphi_{i}} \right) \right\rbrack + {\sqrt{2}{\sum\limits_{i = 1}^{\frac{k - 1}{2}}{X_{i}{\cos\left\lbrack {{2\;\pi\;\left( {f_{0} - f_{i}} \right)t} + \left( {\varphi_{0} - \varphi_{i}} \right)} \right\rbrack}}}} + {n(t)}}\end{matrix} & (16)\end{matrix}$

It can be seen that the amplitude modulated signal containing differentmodulation components can be equivalent to that being superposed bysignals of different frequencies. Its phasor form is:

$\begin{matrix}{\overset{.}{X} = {{{X_{0}e^{j{({{2\pi\; f_{0}t} + \varphi_{0}})}}} + {\sum\limits_{i = 1}^{\frac{k - 1}{2}}{X_{i}e^{j{\lbrack{{2{\pi{({f_{0} + f_{i}})}}t} + {({\varphi_{0} + \varphi_{i}})}}\rbrack}}}} + {\sum\limits_{i = 1}^{\frac{k - 1}{2}}{X_{i}e^{j{\lbrack{{2{\pi{({f_{0} - f_{i}})}}t} + {({\varphi_{0} - \varphi_{i}})}}\rbrack}}}}} = {{\overset{.}{X}}_{0} + {\sum\limits_{i = 1}^{\frac{k - 1}{2}}\left( {{\overset{.}{X}}_{i} + {\overset{.}{X}}_{i}^{\prime}} \right)}}}} & (17)\end{matrix}$where {dot over (X)}_(i) and {dot over (X)}′_(i) are the phasors whichcorrespond to the last two terms of the polynomial.

FIG. 7 shows an example schematic diagram of signal component phasorduring amplitude modulation. In FIG. 7, the horizontal axis and thevertical axis represent the real part and the imaginary part. Thecoordinate axes and the fundamental phasor {dot over (X)}₀ rotatesynchronously with the same angular velocity 2πƒ₀. The remaining

$\frac{k - 1}{2}$pairs of phasors {dot over (X)}_(i) and {dot over (X)}′_(i) rotatesymmetrically about {dot over (X)}₀ with the angular velocity 2πƒ_(i).Therefore, the phasor amplitude |{dot over (X)}| can be equivalent tothe superposition of the projection of each frequency signal on {dotover (X)}₀, as shown in formula (18).

$\begin{matrix}{{\overset{.}{X}} = {{X_{0} + {\sum\limits_{i = 1}^{\frac{k - 1}{2}}{X_{i}{\cos\left( {{2\pi\; f_{i}t} + \varphi_{i}} \right)}}} + {\sum\limits_{i = 1}^{\frac{k - 1}{2}}{X_{i}^{\prime}{\cos\left( {{2\pi\; f_{i}t} + \varphi_{i}} \right)}}}} = {X_{0} + {2{\sum\limits_{i = 1}^{\frac{k - 1}{2}}{X_{i}{\cos\left( {{2\pi\; f_{i}t} + \varphi_{i}} \right)}}}}}}} & (18)\end{matrix}$

The amplitude is analyzed as an object, assuming that the PMU measuresthe phasor upload frequency as ƒ_(N), and the time window is t.

FIG. 8 shows an example schematic diagram of PMU measurement data. Thedata can be X=[X_(t) ₁ , X_(t) ₂ , . . . , X_(t) _(N−1) , X_(t) _(N) ].In FIG. 8,

$X_{t_{1}} = {X_{0} + {2{\sum\limits_{i = 1}^{\frac{k - 1}{2}}{X_{i}{{\cos\left( {{2\pi\; f_{i}t_{1}} + \varphi_{i}} \right)}.}}}}}$The data number is N=f_(N)·T, which is divided into three categories:training data X_(train)=[X_(t) ₁ , X_(t) ₂ , . . . , X_(t) _(m−1) ,X_(t) _(n+1) , X_(t) _(n+2) , . . . , X_(t) _(N) ], verification dataX_(test)=[X_(t) _(m−4) , X_(t) _(m−3) , X_(t) _(m−2) , X_(t) _(m−1) ,X_(t) _(n+1) , X_(t) _(n+2) , X_(t) _(n+3) , X_(t) _(n+4) ], and themissing data X_(lost)=[X_(t) _(m) , X_(t) _(m+1) , . . . , X_(t) _(n−1), X_(t) _(n) ]. Among them, the training and verification data are knowndata, which are used to decompose the characteristic components of themeasurement data and verify the accuracy of the recovered data. Theinitial value of the missing data is zero, which is recovered by thedecomposed characteristic components.

According to the Takens theorem, the one-dimensional time series X ismapped to the trajectory matrix Y of the high-dimensional space.

$\begin{matrix}{X = {\left. \left\lbrack {X_{t_{1}},X_{t_{2}},\ldots\mspace{14mu},\ X_{t_{N}}} \right\rbrack\rightarrow Y \right. = \left\lbrack {Y_{1},Y_{2},{\ldots\mspace{14mu} Y_{K}}} \right\rbrack}} & (19) \\{{Y_{z} = \left\lbrack {X_{z},X_{z + 1},\ldots\mspace{14mu},X_{z + L - 1}} \right\rbrack^{T}},{z = 1},2,\ldots\mspace{14mu},K} & \; \\{Y = \begin{bmatrix}X_{t_{1}} & X_{t_{2}} & \ldots & X_{t_{m}} & \ldots & X_{t_{n}} & \ldots & X_{t_{K}} \\X_{t_{2}} & X_{t_{3}} & \ldots & X_{t_{m + 1}} & \ldots & X_{t_{n} + 1} & \ldots & X_{t_{K + 1}} \\\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\X_{t_{L}} & X_{t_{L + 1}} & \ldots & X_{t_{m - 1 + L}} & \ldots & X_{t_{n - 1 + L}} & \ldots & X_{t_{N}}\end{bmatrix}} & (20)\end{matrix}$where the matrix Y is a Hankel matrix, which means that Y has equalelements Y_(ij) on the anti-diagonals i+j=const. Subscript L is thenumber of matrix rows, which can measure the dimension of the matrix,and K is the number of matrix columns, K=N−L+1.

The covariance matrix of Y is S=YY^(T). The eigenvalue decomposition ofthe matrix S is then performed to determine the characteristics of eachcomponent. It is denoted by the eigenvalues (λ₁≥ . . . ≥λ_(L)) of Staken in the decreasing order of magnitude, and U₁, U₂, . . . , U_(L) isthe orthonormal system of the eigenvectors of the matrix correspondingto λ. Additionally, V_(i)=Y^(T)U_(i)/√{square root over (λ_(i))}, andi=1, 2, . . . , L. Next, d is set as the subscript corresponding to thelargest eigenvalue, that is, the rank of the matrix Y.d=rank Y=max{i|s.t.λ _(i)>0}  (21)

Thus, the trajectory matrix Y can be written as:Y=Y ₁ +Y ₂ + . . . +Y _(k−1) +Y _(k) +Y _(k+1) + . . . +Y _(d)  (22)where Y_(i)=√{square root over (λ_(i))}U_(i)V_(i) ^(T), i=1, 2, . . . ,d are matrices with rank 1, which are called elementary matrices. Thecollection (√{square root over (λ_(i))}, U_(i), V_(i)) is named the itheigentriple, and {√{square root over (λ_(i))}} are the singular values,providing the singular spectrum of Y. The √{square root over(λ_(i))}V_(i) are called vectors of principal components, whichrepresent the phasors in FIG. 7. For example, Y₁ corresponds to thefundamental phasor. Y₂ and Y₃ correspond to {dot over (X)}₁ and {dotover (X)}′₁, Y_(k−1) and Y_(k) correspond to

${\overset{.}{X}}_{\frac{k - 1}{2}}$and

${\overset{.}{X}}_{\frac{k - 1}{2}}^{\prime},$and Y_(k+1), Y_(k+2), . . . , Y_(d) correspond to the insignificantcomponents such as noise.

Thus, the choice of the number of characteristic components k candirectly affect the sequence reconstruction. If k is too large, it willcontain non-meaningful components such as noise, which can easily causethe inundation of the missing data. If it is too small, it will not beenough to include all the effective characteristics, resulting in lowdata recovery accuracy.

5.2. Optimal Number of Characteristic Components

Most existing methods use the technique of estimating eigenvalues tosolve the characteristic component number, and the robustness of thesetechniques is poor. As the PMU data has a high signal-to-noise ratio andthe noise model is complex, this disclosure selects the Gershgorin diskmethod to obtain the optimal characteristic component number k, which isrobust.

The Gershgorin disk method estimates the eigenvalue position of thematrix S using the disk center o_(i) and the disk radius r_(i). Eachcenter o_(i) is determined by the main diagonal element S_(Tii) of S_(T)after unitary transformation, and the radius r_(i) is:

$\begin{matrix}{{r_{i} = {\sum\limits_{{j = 1},{i \neq j}}^{L}{S_{T_{ij}}}}},{i = 1},2,\ldots\mspace{14mu},L} & (23)\end{matrix}$

The eigenvalues of the matrix S are contained in each disk, and thematrix S must have d disks that are not included in each other. FIG. 9shows an example schematic diagram of Gershgorin disk method. In FIG. 9,the size of each disk represents the specific gravity of eachcharacteristic component. The intersection of the Disk O₁ and Disk O₂has only two eigenvalues. If a characteristic component representsnoise, the disk will infinitely approach zero. Disk O₁ is the proportionof the fundamental amplitude |{dot over (X)}₀| in the phasor magnitude|{dot over (X)}| in FIG. 7 and so on. Disks o_(k+1), o_(k+2), . . . ,o_(d) are the proportion of meaningless components such as noise.

The optimal characteristic component number can therefore be judged bythe number of non-zero radius, and the component number criterion is asfollows:

$\begin{matrix}{{G\; D\;{E(j)}} = {{r_{j} - {\frac{P}{d - 1}{\sum\limits_{i = 1}^{d - 1}r_{i}}}} > 0}} & (24)\end{matrix}$where the adjustment factor P is from 0 to 1, j∈[1, d−2]. When GDE(j)<0for the first time, set j=j₀. Thus, the optimal component number k isj₀−1.

5.3. Original Sequence Reconstruction and Data Recovery

The original sequence can be reconstructed by k characteristiccomponents after filtering the noise. Then:Y≈Y ₁ +Y ₂ + . . . +Y _(k)  (25)

Using the diagonal average criterion, each of the above matrices offormula (25) are converted into a new series of length N. Let X*_(ij) bethe element of the ith row and jth column of the reconstruction matrixY_(i). Then, the qth data in the reconstructed sequence X(1) is theaverage of all X(1)_(q), satisfying i+j=q+1. Set L*=min(L, K), K*=max(L,K), and

$\begin{matrix}{X_{ij}^{*} = \left\{ \begin{matrix}{X_{ij},{L < K}} \\{X_{ji},{L \geq K}}\end{matrix} \right.} & (26)\end{matrix}$

The reconstruction sequence is as follows:

$\begin{matrix}{{X(1)}_{q} = \left\{ \begin{matrix}{{\frac{1}{q}{\sum\limits_{p = 1}^{q}X_{p,{q - p + 1}}^{*}}},} & {1 \leq q < L^{*}} \\{{\frac{1}{L^{*}}{\sum\limits_{p = 1}^{L^{*}}X_{p,{q - p + 1}}^{*}}},} & {L^{*} \leq q \leq K^{*}} \\{{\frac{1}{N - q + 1}{\sum\limits_{p = {q - K^{*} + 1}}^{N - q + 1}X_{p,{q - p + 1}}^{*}}},} & {K^{*} < q \leq N}\end{matrix} \right.} & (27)\end{matrix}$

The above steps reconstruct the original sequence X decomposition intothe sum X(1) of k sequences [X⁽¹⁾, X⁽²⁾, . . . , X^((k))]. Where:

$\begin{matrix}{X^{(k)} = \left\lbrack {X_{t_{1}}^{(k)},X_{t_{2}}^{(k)},\ldots\mspace{14mu},X_{t_{N - 1}}^{(k)},X_{t_{N}}^{(k)}} \right\rbrack} & (28) \\\left. {X(1)}\leftrightarrow\begin{bmatrix}X_{t_{1}}^{(1)} & X_{t_{2}}^{(1)} & \ldots & X_{t_{N - 1}}^{(1)} & X_{t_{N}}^{(1)} \\X_{t_{1}}^{(2)} & X_{t_{2}}^{(2)} & \ldots & X_{t_{N - 1}}^{(2)} & X_{t_{N}}^{(2)} \\\vdots & \vdots & \vdots & \vdots & \vdots \\X_{t_{1}}^{(k)} & X_{t_{2}}^{(k)} & \ldots & X_{t_{N - 1}}^{(k)} & X_{t_{N}}^{(k)}\end{bmatrix} \right. & (29)\end{matrix}$

It can be seen from formula (29) that X⁽¹⁾ is the fundamental phasormagnitude |{dot over (X)}₀|, X⁽²⁾ is |{dot over (X)}₁| cos(2πƒ₁t+φ₁),X⁽³⁾ is |{dot over (X)}′₁| cos(2πƒ₁t+φ₁), and X^((k)) is

${{\overset{.}{X}}_{\frac{k - 1}{2}}^{\prime}}{{\cos\left( {{2\pi\; f_{\frac{k - 1}{2}}t} + \varphi_{\frac{k - 1}{2}}} \right)}.}$

Therefore, the missing data X_(lost) can be replaced by thecorresponding position value of the reconstructed sequence X(1). If theverification data meets the threshold requirement at this time, then:max |X(1)_(test) −X _(test)|<λ  (30)where λ is the artificially set error threshold. This shows that themissing data has been effectively recovered. Otherwise, according to thesame parameters, the sequence X(1) is constructed as a Hankel matrix,decomposed, and reconstructed until the verification data meets thethreshold requirement. The missing data recovery result is thecorresponding position value.

5.4. Algorithm Parameter Setting and Recovery Steps

The loss ratio R, the number of rows of the trajectory matrix L, and theoptimal characteristic component number k, directly influence thegeneration of the reconstructed sequence, which in turn affects theaccuracy of data recovery. The characteristic components number k issolved by Gershgorin disk method, and the remaining parameters aredetermined by simulation. Taking the mentioned amplitude modulationsignal as an example, the optimal characteristic components number k iskept constant, the loss ratio R and the number of matrix rows L arechanged, and the recovery precision is as shown in FIG. 10. As there isno relevant standard for measuring the dynamic performance of the PMU,the amplitude error and the phase angle error are used to measure theaccuracy of the recovered data, and the results can be visuallycompared. This disclosure selects the total vector error (TVE) in theIEEE C37.118.1a-2014 standard to characterize recovery accuracy. Thecalculation formula is as follows:

$\begin{matrix}{{T\; V\; E} = \sqrt{\frac{\left( {X_{r}^{\prime} - X_{r}} \right)^{2} + \left( {X_{i}^{\prime},{- X_{i}}} \right)^{2}}{X_{r}^{2} + X^{2}}}} & (31)\end{matrix}$where X_(r) and X_(i) are the real and imaginary value of theoreticalvalues of the input signal, X′_(r) and X′_(i) are the real and imaginaryvalue of the estimates.

FIG. 10 shows exemplary relationships between loss ratio, matrix rownumber, and recovery accuracy. As seen in FIG. 10, as the loss ratioincreases, the TVE gradually increases. The TVE is symmetricallydistributed with respect to L=0.5 N. When the loss ratio is less than10%, the maximum TVE is 10⁻⁸%, regardless of the value of L. When theloss ratio R is higher than 10% and lower than 40%, the value of L isbetween [0.4 N, 0.6 N], which can keep the TVE below 0.06%. When theloss ratio is higher than 45%, the error is large. To ensure theaccuracy of data recovery and take the impact of data volume on therunning time into account, this disclosure sets the loss ratio R to 30%,and the value of L is between [0.4 N, 0.6 N].

In summary, the steps of data recovery are as follows: 1) The amount ofdata N and the number of rows of the track matrix L, according to theactual data loss position, are determined and the optimal loss ratio Ris 30%. 2) A trajectory matrix is constructed and singular valuedecomposition is performed to extract all the characteristic components.3) To filter out the influence of components such as noise, Gershgorindisk method is combined to solve the optimal characteristic componentnumber k. 4) The original sequence is reconstructed using the optimalcharacteristic component. 5) If the verification data meets thethreshold requirement, the calculation stops. If not, the iteration isrepeated according to the same parameters until the requirements aremet.

5.5. Exemplary Embodiment

FIG. 11 shows a flow chart of a PMU data identification and recoveryalgorithm according to an example embodiment. In this exampleembodiment, the identification and recovery algorithm can be implementedin a unit (“device”) of a PMU. A PMU can include a microcontroller ormicroprocessor, an Analog-to-Digital Converter, a GPS, a local memory,and a communication interface.

In step 1110, the device can obtain phasor data (e.g., datasets C_(i)each including a plurality of data points) from a power network. Thephasor data can include a voltage signal and a current signal. For eachsignal, the device can obtain the amplitude, phase, frequency and ROCOF.

In step 1120, each dataset C_(i) is divided into ambient data ordisturbance data. For example, a binary classification model can be usedto identify and separate ambient and disturbance data. In particular, amachine learning decision tree can be adopted to make thisclassification as follows.

In a first step, all the datasets C_(i) can be provided as inputs to thedecision tree. In a second step, the gain ratios of the features (a, b)are calculated, and the largest one is selected for comparison with athreshold ε. The gain ratio is defined as follows:

$\begin{matrix}{{g\left( {T,a,q_{i}} \right)} = \frac{G\left( {T,a,q_{i}} \right)}{I\;{V(a)}}} & (4) \\{{I\;{V(a)}} = {- {\sum\limits_{\beta \in {\{{- {, +}}\}}}{\frac{T_{q_{i}}^{\beta}}{T}\log_{2}\frac{T_{q_{i}}^{\beta}}{T}}}}} & (5)\end{matrix}$where IV(a) is the intrinsic value; for each dataset C_(i), a_(i) is thevariance, b_(i) is the extreme value difference and q_(i) is divingpoint; and G(T, a, q_(i)) is information gain of q_(i) which can becomputed as follows:

$\begin{matrix}{{G\left( {T,a,q_{i}}\  \right)} = {{H(T)} - {\frac{T_{q_{i}}^{-}}{T}{H\left( T_{q_{i}}^{-} \right)}} - {\frac{T_{q_{i}}^{+}}{T}{H\left( T_{q_{i}}^{+} \right)}}}} & (3)\end{matrix}$where |T| represents the number of groups, |T_(q) _(i) ⁻|/|T| is theweight of the groups in which a_(i)≤q_(i), and |T_(q) _(i) ⁺|/|T|represents the weight of the groups where a_(i)>q_(i).

If the gain ratio is less than E, the tree is a single node tree and itis considered that the state of all the datasets is the state of themajority. If not, it is assumed that g(T, a, q_(a)) is the largest one.

In a third step, a_(i) is compared with q_(a), and all datasets C_(i)with a_(i)≤q_(a) are placed in one class (ambient data—to continue tostep 1130), and the remaining are placed in another class (disturbancedata—to continue to step 1140). The above steps are then repeatedrecursively until all datasets in one class are in the same state, orthe maximum gain ratio is less than ε.

In steps 1130, the ambient data is restored by a recovery algorithmbased on interpolation. In particular, in step 1131, the devicedetermines the number E of data points missing, i.e., even or odd, andthe device determines a type for the missing data, i.e., ambient data ordisturbance data.

In step 1132, a data recovery technique is implemented according to thepriority allocation strategy, i.e., a strategy the case when even datapoints are missing and a strategy for a case when odd data points aremissing. In particular, the interpolation interval is set to Z. When Eis odd,

$Z_{1} = {\frac{E + 1}{2}.}$X_(n) is to be recovered in the first stage using eight data points(X_(n−4Z) ₁ , X_(n−3Z) ₁ , X_(n−2Z) ₁ , X_(n−Z) ₁ , X_(n+Z) ₁ , X_(n+2Z)₁ , X_(n+3Z) ₁ , X_(n+4Z) ₁ ) with the interval Z₁, where

${n = \left\lceil \frac{E}{2} \right\rceil}.$Next, X_(n−2) and X_(n+2) are recovered in the second stage using theexisting data between which the interval is Z₂, and the data recoveredin the first stage (X_(n)), where Z₂=Z₁−1. Finally, X_(n−1) and X_(n+1),are recovered in the third stage, using the existing data between whichthe interval is Z₃, and the data recovered in the first two stages(X_(n−2), X_(n), X_(n+2)), where Z₃=Z₂−1.

When E is even,

$Z_{1} = {\frac{E + 2}{2}.}$Both X_(n) and X_(n−1) are recovered in the first stage using eight datapoints (X_(n−4Z) ₁ , X_(n−3Z) ₁ , X_(n−2Z) ₁ , X_(n−Z) ₁ , X_(n+Z) ₁ ,X_(n+2Z) ₁ , X_(n+3Z) ₁ , X_(n+4Z) ₁ and X_(n−1−4Z) ₁ , X_(n−1−3Z) ₁ ,X_(n−1−2Z) ₁ , X_(n−1−Z) ₁ , X_(n−1+Z) ₁ , X_(n−1+2Z) ₁ , X_(n−1+3Z) ₁ )with the interval Z₁, respectively. X_(n−3) and X_(n+2) are recovered inthe second stage using the existing data between which the interval isZ₂, and the data recovered in the first stage (X_(n−1), X_(n)), whereZ₂=Z₁−1. In the third stage, X_(n−2) and X_(n+1) are recovered using theexisting data between which the interval is Z₃, and the data recoveredin the first two stages (X_(n−3), X_(n−1), X_(n), X_(n+2)), whereZ₃=Z₂−1.

In step 1133, an improved cubic spline interpolation function is foundthat satisfies the extreme value difference constraint and the missingdata is recovered. A function X=ƒ(t) of voltage amplitude X_(i) and timet_(i) is constructed, where t_(i)∈[t_(a), t_(b)] and i=0, 1, . . . , n.If S(t) satisfies: 1) On each subinterval [t_(k), t_(k+1)] (k=1, 2, . .. , n−1, S(t) is a polynomial no more than three times, andS(t_(i))=X_(i). 2) S(t), S′(t), and S″(t) are contiguous on interval[t_(a), t_(b)]. Therefore, S(t) is the cubic spline interpolationfunction of ƒ(t) on node t₀, t₁, . . . , x_(n). Here, the cubic splineinterpolation function is denoted as:

$\begin{matrix}{{S(t)} = {{M_{i}\frac{\left( {t_{i + 1} - t} \right)^{3}}{6h_{i}}} + {M_{i + 1}\frac{\left( {t - t_{i}} \right)^{3}}{6h_{i}}} + {\left( {\frac{X_{i}}{h_{i}} - {\frac{M_{i}}{6}h_{i}}} \right)\left( {t_{i + 1} - t} \right)} + {\left( {\frac{X_{i + 1}}{h_{i}} - {\frac{M_{i + 1}}{6}h_{i}}}\  \right)\left( {t - t_{i}} \right)}}} & (9)\end{matrix}$where M_(i)=S″(t_(i)). The missing data is obtained using the cubicspline interpolation function in step 1150 using the known data.

In steps 1140, the disturbance data is restored by singular valuedecomposition. In particular, in step 1141, the amount of data N and thenumber of rows of the track matrix L, according to the actual data lossposition, are determined and the optimal loss ratio R is 30%.

In step 1142, a trajectory matrix is constructed and singular valuedecomposition is performed to extract all the characteristic components.Specifically, the one-dimensional time series X is mapped to thetrajectory matrix Y of the high-dimensional space to generate thetrajectory matrix Y:

$\begin{matrix}{X = {\left. \left\lbrack {X_{t_{1}},X_{t_{2}},\ldots\mspace{14mu},X_{t_{N}}} \right\rbrack\rightarrow Y \right. = \left\lbrack {Y_{1},Y_{2},{\ldots\mspace{14mu} Y_{K}}} \right\rbrack}} & (19) \\{{Y_{z} = \left\lbrack {X_{z},X_{z + 1},\ldots\mspace{14mu},X_{z + L - 1}} \right\rbrack^{T}},{z = 1},2,\ldots\mspace{14mu},K} & \; \\{Y = \begin{bmatrix}X_{t_{1}} & X_{t_{2}} & \ldots & X_{t_{m}} & \ldots & X_{t_{n}} & \ldots & X_{t_{K}} \\X_{t_{2}} & X_{t_{3}} & \ldots & X_{t_{m + 1}} & \ldots & X_{t_{n + 1}} & \ldots & X_{t_{K + 1}} \\\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\X_{t_{L}} & X_{t_{L + 1}} & \ldots & X_{t_{m - 1 + L}} & \ldots & X_{t_{n - 1 + L}} & \ldots & X_{t_{N}}\end{bmatrix}} & (20)\end{matrix}$where the matrix Y is a Hankel matrix, which means that Y has equalelements Y_(ij) on the anti-diagonals i+j=const. Subscript L is thenumber of matrix rows, which can measure the dimension of the matrix,and K is the number of matrix columns, K=N−L+1.

The covariance matrix of Y is S=YY^(T). The eigenvalue decomposition ofthe matrix S is then performed to determine the characteristics of eachcomponent. It is denoted by the eigenvalues (λ₁≥ . . . ≥λ_(L)) of Staken in the decreasing order of magnitude, and U₁, U₂, . . . , U_(L) isthe orthonormal system of the eigenvectors of the matrix correspondingto λ. Additionally, V_(i)=Y^(T)U_(i)/λ_(i), and i=1, 2, . . . , L. Next,d is set as the subscript corresponding to the largest eigenvalue, thatis, the rank of the matrix Y.d=rank Y=max{i|s.t. λ _(i)>0}  (21)

Thus, the trajectory matrix Y can be written as:Y=Y ₁ +Y ₂ + . . . +Y _(k−1) +Y _(k) +Y _(k+1) + . . . +Y _(d)  (22)where Y_(i)=√{square root over (λ_(i))}U_(i)V_(i) ^(T), i=1, 2, . . . ,d are matrices with rank 1, which are called elementary matrices. Thecollection (√{square root over (λ_(i))}, U_(i), V_(i)) is named the itheigentriple, and {√{square root over (λ_(i))}} are the singular values,providing the singular spectrum of Y.

As an optional step, Gershgorin disk method can be performed to solvethe optimal characteristic component number k. In particular, theoptimal characteristic component number can be determined by the numberof non-zero radius, and the component number criterion is as follows:

$\begin{matrix}{{G\; D\;{E(j)}} = {{r_{j} - {\frac{P}{d - 1}{\sum\limits_{i = 1}^{d - 1}r_{i}}}} > 0}} & (24)\end{matrix}$where the adjustment factor P is from 0 to 1, j∈[1, d−2], and r_(j) andr_(i) are different eigenvalues of matrix S. When GDE(j)<0 for the firsttime, set j=j₀. Thus, the optimal component number k is j₀−1.

In step 1143, the original sequence is reconstructed using the optimalcharacteristic component. Specifically, the original sequence Xdecomposition can be reconstructed into the sum X(1) of k sequences[X⁽¹⁾, X⁽²⁾, . . . , X^((k))]. Where:

$\begin{matrix}{X^{(k)} = \left\lbrack {X_{t_{1}}^{(k)},X_{t_{2}}^{(k)},\ldots\mspace{14mu},X_{t_{N - 1}}^{(k)},X_{t_{N}}^{(k)}} \right\rbrack} & (28) \\\left. {X(1)}\leftrightarrow\begin{bmatrix}X_{t_{1}}^{(1)} & X_{t_{2}}^{(1)} & \ldots & X_{t_{N - 1}}^{(1)} & X_{t_{N}}^{(1)} \\X_{t_{1}}^{(2)} & X_{t_{2}}^{(2)} & \ldots & X_{t_{N - 1}}^{(2)} & X_{t_{N}}^{(2)} \\\vdots & \vdots & \vdots & \vdots & \vdots \\X_{t_{1}}^{(k)} & X_{t_{2}}^{(k)} & \ldots & X_{t_{N - 1}}^{(k)} & X_{t_{N}}^{(k)}\end{bmatrix} \right. & \;\end{matrix}$

Therefore, the missing data X_(lost) can be replaced by thecorresponding position value of the reconstructed sequence X(1).

In step 1144, if the verification data meets the below thresholdrequirement, the calculation stops.max |X(1)_(test) −X _(test)|<λ  (30)

If not, the iteration is repeated according to the same parameters untilthe requirements are met. Specifically, according to the sameparameters, the sequence X(1) is constructed as a Hankel matrix,decomposed, and reconstructed until the verification data meets thethreshold requirement.

In step 1150, the lost data will be recovered. The missing data isobtained using the cubic spline interpolation function using the knowndata.

The recovered data can improve the data quality of PMUs and can be usedfor further applications of power systems, such as power system dynamicmonitoring, power system state estimation, parameter identification,closed-loop control, and so on.

6. Case Studies

The algorithm of this disclosure is tested by simulation. In addition, alaboratory test platform is also set up to verify the method by fieldrecorded PMU data. The results are compared with Lagrange interpolationmethod and OLAP method.

6.1. Simulation

6.1.1. Simulation of Data Identification Method

A simulation signal with five inter-harmonics is input into the traineddecision tree, and the identification result is recorded. The generalexpression of its signal is as follows:

$\begin{matrix}{{x(t)} = {{\sqrt{2}X_{m0}{\cos\left( {{2\pi\; f_{0}t} + \varphi_{0}} \right)}} + {\sqrt{2}{\sum\limits_{i = 1}^{5}{X_{mi}{\cos\left( {{2\pi\; f_{i}t} + \varphi_{i}} \right)}}}}}} & (32)\end{matrix}$where X_(mi) is the amplitude of a certain harmonic phasor, ƒ_(i) is itsharmonic frequency, and φ_(i) is its initial phase angle. Additionally,X_(m1)=1.2 V, ƒ₁=22 Hz, φ₁=0°. X_(m2)=1.5 V, ƒ₂=68 Hz, φ₂=0°. X_(m3)=2.4V, ƒ₃=74 Hz, φ₃=0°. X_(m4)=2.8 V, ƒ₄=84 Hz, φ₄=0°, X_(m5)=3.4 V, ƒ₅=96Hz, and φ₅=0°.

FIG. 12 shows example identification result of ambient and disturbancedata. The horizontal and vertical axes in FIG. 12 represent the voltageamplitude variance and the extreme value difference, respectively. Thedot line 1205 is ambient data and the dot line 1210 is disturbance data.As illustrated in FIG. 12, the ambient and disturbance data is welldistinguished, and accuracy is 100%. The running time of theidentification method is about 0.005 s, which is much shorter than therunning time of the recovery method. So, in this disclosure the runningtime of the described method only contains the running time of therecovery method.

Then, ambient and disturbance data is set lost randomly, includingcontiguous data loss and non-contiguous data loss. FIG. 13 shows exampleidentification result of ambient and disturbance data when some data islost. FIG. 13 shows that regardless of the data loss situation, theidentification algorithm can accurately identify ambient and disturbancedata. The accuracy of non-contiguous data loss is slightly higher thancontiguous data loss. It can be seen that the amount of data loss hasslight effect on the identification algorithm. Because the variance andextreme value difference of each group can represent the differencebetween ambient data and disturbance data. The variance and extremevalue difference of ambient data are smaller than those of disturbancedata. If there is some missing data in a group, the subsequent data willbe filled in the group. The variance and extreme value difference of thegroup will not be affected much. The decision tree can identify thelabel of the group according to the features a_(i) and b_(i).

6.1.2. Simulation of Ambient Data Recovery Method

Under ideal conditions, the data of the power system is ambient and hasno external interference. The general expression of its signal is asfollows:x(t)=√{square root over (2)}X _(m) cos(2πƒ₀ t+φ ₀)  (33)where X_(m) is the phasor amplitude, ƒ₀ represents power frequency, φ₀is the initial phases, X_(m)=57.73 kV, ƒ₀=50 Hz, and φ₀=0.

When the frequency offset occurs, only considering the frequencyvariation caused by load and generator output frequency imbalance, andthe system operates near the rated frequency. The frequency offsetsignal is expressed as follows:x(t)=√{square root over (2)}X _(m) cos[2π(ƒ₀+Δƒ)t+φ ₀]  (34)where Δƒ is the frequency offset. Δƒ=5 Hz.

In the simulation, the size of dataset used in the simulation is 500.Each experiment in the simulation is repeated 10 times. There are threetypes of data loss, which are single data loss, non-contiguous dataloss, and contiguous data loss. The data loss ratio of non-contiguousand contiguous data loss is 30%.

For ambient data, the described method does not require parametersettings. It selects the appropriate known data and order for recoverybased on the priority allocation strategy. The parameters of OLAP methodcan be set using known techniques, where L=10, κ=6, e_(a)=0.002, η=2.4.

1) Optimal Number of Data Points for Interpolation

The single randomly selected missing data is recovered by this method.Tests are repeated 20 times, and results of TVE are the same owing toits periodicity. Results are as followed in Table 1.

TABLE 1 Optimal data points for interpolation Data Points 2 4 6 8 9 10TVE 4.89% 0.16% 0.06% 0.07% 0.07% 0.07%

Table 1 shows that starting from eight points, TVE is constant andmissing data is efficiently recovered. So, eight fore-and-spatial pointsare used to recover missing data.

2) Simulation of Ambient Data Recovery

First, some data of the ideal signal Eq. (33) without noise is randomlyselected as the missing data. The mean values of TVE of recovered dataand the running time of three methods are compared in Tables 2 and 3.

TABLE 2 Comparison of recovery accuracy of three methods Data loss typeTVE_(L)/% TVE_(O)/% TVE_(C)% Single data loss (one data loss) 0 0 0Non-contiguous data loss 0 0 0 (30% loss) Contiguous data loss (30%loss) 0 0 0

TABLE 3 Comparison of running time of three methods Data loss typeTime_(L)/s Time_(O)/s Time_(C)/s Single data loss (one data loss) 1.021.22 1.08 Non-contiguous data loss 2.18 3.28 2.24 (30% loss) Contiguousdata loss (30% loss) 2.24 3.52 2.31

In Tables 2 and 3, L represents the recovery result of Lagrangeinterpolation method, C represents the described recovery result ofambient data recovery method. O represents the recovery result of OLAPmethod. The results show that the amplitude, phase, frequency, andfrequency change rate of the above signal do not change, and the outputresult is constant. At this point, the signal characteristics areobvious, and the PMU missing data is easy to recover. Therefore, threemethods can effectively recover different types of missing data. For therunning time of three methods, when there is single data loss, Time_(L)(1.02 s) is shorter than Time_(C) (1.08 s) and Time_(O) (1.22 s). Whenthere is multiple data loss, Time_(L) is 2.18 s and Time_(C) is 2.24 s.The running time of OLAP becomes 3.28 s. So, the Lagrange method and thedescribed method are faster than the OLAP method, especially when thereis multiple missing data.

When the power system is in normal operation, the frequency offsetusually occurs. According to C37.118.1.a-2014 IEEE Standard, the maximumrange of frequency offset is from −5 Hz to +5 Hz. The signal Eq. (34)can be used to compare the recovery of three methods. FIG. 14 showsexample results of three methods for different data loss types. Tables4, 5 and 6 also show similar results.

In particular, the recovery result of different types of missing datafor the frequency offset signal in the complex plane is illustrated inFIG. 14. The horizontal and vertical axes are the real and imaginaryparts, respectively. The Ls and Lm are the results recovered by Lagrangemethod (s means single missing data, m means multiple missing data, thefollowing are the same), Cs and Cm are the results recovered by thedescribed ambient data recovery method, Os and Om are the resultsrecovered by OLAP method. For single data loss, the result of thedescribed method (TVE=0.07%) is better than Lagrange method (TVE=3.06%),almost the same as OLAP method (TVE=0.05%). For non-contiguous dataloss, the TVE results are presented in Table 4. The mean TVE of thedescribed method is 0.652%, which is much smaller than Lagrange method(TVE=50.05%) and slightly larger than OLAP method (TVE=0.54%). Forcontiguous data loss, Table 5 shows that Lagrange method experiencesserious distortion with an increase of missing data. The mean TVE of thedescribed method is 0.73% and TVE of OLAP method is 0.636%. The recoveryof these two methods is close. The running time of three methods isshown in Table 6. It indicates that when there is single data loss,Time_(L) is 0.86 s. The running time of the described method is similarto Lagrange method. Time_(O) is 1.36 s. When there is multiple missingdata, the running time of OLAP method is about 2 times than the othermethods. Its computational efficiency is affected by the amount ofmissing data. Therefore, the described method can recover the missingdata precisely. And its calculation speed is faster.

TABLE 4 Comparison of recovery accuracy of three methods innon-contiguous data loss Data Point TVE_(L)/% TVE_(O)/% TVE_(C)/% Data 13.06 0.54 0.77 Data 2 12.14 0.68 0.73 Data 3 29.95 0.42 0.64 Data 458.92 0.57 0.71 Data 5 146.18 0.47 0.41

TABLE 5 Comparison of recovery accuracy of three methods in contiguousdata loss Data Point TVE_(L)/% TVE_(O)/% TVE_(C)/% Data 1 3.06 0.73 0.84Data 2 12.14 0.58 0.67 Data 3 29.95 0.67 0.58 Data 4 58.92 0.67 0.81Data 5 101.03 0.53 0.75

TABLE 6 Comparison of running time of three methods Data loss typeTime_(L)/s Time_(O)/s Time_(C)/s Single data loss 0.86 1.36 0.88Non-contiguous data loss 1.02 2.24 1.13 Contiguous data loss 1.26 2.411.32

6.1.3. Simulation of Disturbance Data Recovery Method

There are many types of disturbance signals in power systems. Twodifferent signals are tested. First, we use the signal with amplitudeand phase angle modulation to express the oscillation with lowoscillation frequency. The signal expression is as follows:x(t)=√{square root over (2)}(X _(m) +X _(d) cos(2πƒ_(a) t+φ_(a)))cos(2πƒ₀ t+X _(k) cos(2πƒ_(a) t+φ _(a))+φ₀)  (35)where X_(d) is amplitude modulation depth, X_(k) is phase anglemodulation depth, ƒ_(a) is modulation frequency, φ_(a) is the initialphase angle of modulation part, X_(d)=10%, ƒ_(a) 10 Hz, and X_(k)=5.7°.

For disturbance data, the parameters of the described method are set asfollows: N=500, L=250, λ=0.0001. The parameters of OLAP method can alsobe set by known techniques, where L=10, κ=6, e_(a)=0.002, η=2.4.Consistent with the above steps, the results are recorded. The meanvalues of TVE of recovery results are shown in Table 7, and the runningtime of three methods is in Table 8.

TABLE 7 Comparison of recovery accuracy of the signal with amplitude andphase angle modulation Data loss type TVE_(L)/% TVE_(O)/% TVE_(S)/%Single data loss (one data loss) 2.50 0.85 0.28 Non-contiguous data loss5.37 1.32 0.38 (30% loss) Contiguous data loss (30% loss) 5.22 1.54 0.77

TABLE 8 Comparison of running time of three methods Data loss typeTVE_(L)/% TVE_(O)/% TVE_(S)/% Single data loss (one data loss) 0.99 1.423.62 Non-contiguous data loss 2.04 4.54 4.45 (30% loss) Contiguous dataloss (30% loss) 2.17 4.47 4.39

As seen from Tables 7 and 8, S means the described disturbance datarecovery method. The accuracy of disturbance data recovery method ismuch higher than Lagrange method and improved at least 2 times than OLAPmethod. For the running time, when there is single data loss, Time_(L)(0.99 s) is shorter than Time_(O) (1.42 s) and Time_(S) (3.62 s). Wherethere is multiple data loss, Time_(L) is 2.17 s. Time_(O) (4.47 s) isabout 3 times as long as the single data loss. Time_(S) (4.39 s)increases slightly. The running time of OLAP method increases with themissing data increases. The described method's running time is notsensitive to the amount of missing data.

To simulate the subsynchronous oscillation occurred in western Chinacaused by inter-harmonics, the signal contains multiple inter-harmonicsis used to do the simulation. The expression of the signal is asfollows:

$\begin{matrix}{{x(t)} = {{\sqrt{2}X_{m0}{\cos\left( {{2\pi\; f_{0}t} + \varphi_{0}} \right)}} + {\sqrt{2}{\sum\limits_{i = 1}^{5}{X_{mi}{\cos\left( {{2\pi\; f_{i}t} + \varphi_{i}} \right)}}}}}} & (36)\end{matrix}$

where X_(mi) is the amplitude of a certain harmonic phasor, ƒ_(i) is itsharmonic frequency, and φ_(i) is its initial phase angle. Additionally,X_(m1)=1.2 V, ƒ₁=22 Hz, φ₁=0°. X_(m2)=1.5 V, ƒ₂=68 Hz, φ₂=0°. X_(m3)=2.4V, ƒ₃=74 Hz, φ₃=0°. X_(m4)=2.8 V, ƒ₄=84 Hz, φ₄=0°, X_(m5)=3.4 V, ƒ₅=96Hz, and φ₅=0°. The magnitude and phase angle of the superimposed phasormodulated at the frequencies of 18 Hz, 24 Hz, 28 Hz, 34 Hz and 46 Hz.

The mean values of TVE of the recovery results and the running time arepresented in Tables 9 and 10.

TABLE 9 Comparison of recovery accuracy of the signal with fiveinter-harmonics Data loss type TVE_(L)/% TVE_(O)/% TVE_(S)/% Single dataloss (one data loss) 0.25 0.05 0.014 Non-contiguous data loss 3.79 0.180.064 (30% loss) Contiguous data loss (30% loss) 4.38 0.14 0.042

TABLE 10 Comparison of running time of three methods Data loss typeTime_(L)/s Time_(O)/s Time_(S)/s Single data loss (one data loss) 1.141.24 3.76 Non-contiguous data loss 2.08 4.74 4.59 (30% loss) Contiguousdata loss (30% loss) 2.28 4.45 4.33

According to Tables 9 and 10, for the recovery of the subsynchronousoscillation signal, when there is single data loss, TVE_(S) (0.014%) issmaller than TVE_(L) (0.25%) and TVE_(O) (0.05%). When the lost dataratio is 30%, TVES is 0.064%. The accuracy of the described method is atleast 59 times better than Lagrange method and about 3 times better thanOLAP method. For the running time, when there is single data point loss,Time_(L) (1.14 s) is shorter than Time_(O) (1.24 s) and Times (3.76 s).Where there is multiple data point losses, Time_(L) is 2.08 s. Time_(O)(4.47 s) is about 4 times longer than the Time_(O) for single data pointloss. Time_(S) (4.39 s) is shorter than Time_(O). The amount of missingdata has an influence on the OLAP method. If there is lots of missingdata points, the running time gets longer. The running time of thedescribed method has no relationship with the data loss. Therefore, thedescribed method can recover much disturbance missing data.

6.2. Field Data Test

6.2.1. Testing Platform Setup

Subsynchronous oscillation events caused by inter-harmonics haveoccurred numerous times in areas with renewable energy generation inwestern China. In this disclosure, field PMU data during asubsynchronous oscillation is used to verify the correctness of thedescribed method by a test platform.

FIG. 15 shows a test platform according to an example embodiment. Thetest platform is composed of a high accuracy generator (OmicronCMC256-plus) 1510, a GPS 1520, a PMU 1530, a data center 1540, as wellas a data application tool 1550. In one example embodiment, thecorresponding hardware setup the PMU is a self-developed measurementdevice of the renewable energy station (SMD-R).

FIGS. 16 and 17 show example PMU measured inter-harmonic data. Thegenerator is used to generate a signal restored from the field data, asshown in FIGS. 16 and 17, and it is sent out to the PMU. The PMU willthen sense the signal, and upload the calculated dataset to the datacenter via Ethernet. In the data application tool, the PMU dataset isfirst randomly discarded, and then recovered using the developed dataidentification and recovery algorithm. The recovered PMU data is thencompared with the field PMU data to validate the accuracy of thealgorithm.

FIGS. 16 and 17 show that there is a single channel measurement and thereporting rate of PMU is 100 Hz. The subsynchronous oscillation lastsfor about 33 seconds, then the grids arrive at the steady state. In oneminute, 6000 field PMU data points exist, including voltage amplitudeand angle.

6.2.2. Identification Method for Field Data

The field data in FIGS. 16 and 17 is input to the test platform, and theresult are provided as follows.

FIG. 18 shows example identification result of ambient and disturbancedata. As indicated in FIG. 18, the ambient and disturbance data iscompletely distinguished, and the accuracy is 100%.

In order to verify the rationality of the parameter selection, ten otherindependent PMU measurement data points were validated to test itsgeneralization ability. The measurement data of one PMU is extracted astest data each time. FIG. 19 shows example identification results offield data from different PMUs. It is shown from FIG. 19 that theselected parameters of the decision tree algorithm are applicable tofield data from different PMUs, and the accuracy of the identificationresult is as high as 98%.

FIG. 20 shows identification result of ambient and disturbance data whensome data is lost. FIG. 20 illustrates when field PMU data is lost, theidentification algorithm can identify the ambient and disturbance datawell and provide a basis for recovery. When the data is not contiguouslylost, the recognition accuracy is slightly higher.

6.2.3. Field Data Verification of Ambient Data Method

For the ambient data, three methods are used to recover randomlyselected single data, contiguous data, and non-contiguous data.

FIG. 21 shows example comparison of single data recovery results. FIG.22 shows example comparison of contiguous data and non-contiguous datarecovery results. The comparison of running time is shown in Table 11.

TABLE 11 Comparison of running time of three methods Data loss typeTime_(L)/s Time_(O)/s Time_(S)/s Single data loss 0.75 1.37 0.98Non-contiguous data loss 1.13 2.15 1.19 Contiguous data loss 0.98 2.411.21

From FIG. 21, it can be seen that for single data point loss, TVE_(L)fluctuates significantly and the error changes greatly. Each TVE_(C) andTVE_(O) are relatively close, and less than 0.02%, which has obviousadvantages compared with TVE_(L). FIG. 22 shows that when there aremultiple missing data points, TVE_(C) is 0.12% and TVE_(O) is 0.10%.Table 11 demonstrates that for the single data point loss, Time_(L)(0.75 s) is shorter than Time_(O) (1.37 s) and Time_(C) (0.98 s). Ifthere is multiple data point losses, the running time of OLAP becomeslonger. Time_(O) (2.41 s) is about two times than Time_(L) (0.98 s). Therunning time of the described method (Time_(C)=1.21 s) is between theother two. So, the described ambient data recovery method is superior tothe Lagrange method in accuracy and superior to the OLAP in runningtime.

6.2.4. Field Data Verification of Disturbance Data Method

For the disturbance data, three methods are used to recover randomlyselected single data, contiguous data, and non-contiguous data. Theresults of 10 groups of single data loss are as in Table 12. FIG. 23shows example comparison of contiguous data and non-contiguous datarecovery results. The running time of three methods is in Table 13.

TABLE 12 Comparison of recovery results of 10 groups in single data lossData Point TVE_(L)/% TVE_(O)/% TVE_(S)/% Data 1 4.16 0.07 0.025 Data 26.76 0.04 0.015 Data 3 3.21 0.08 0.039 Data 4 6.06 0.07 0.014 Data 52.45 0.05 0.006 Data 6 0.74 0.01 0.018 Data 7 2.53 0.06 0.025 Data 81.69 0.04 0.032 Data 9 6.52 0.05 0.019 Data 10 5.86 0.04 0.014

TABLE 13 Comparison of running time of three methods Data loss typeTime_(L)/s Time_(O)/s Time_(S)/s Single data loss 1.14 1.35 3.53Non-contiguous data loss 2.14 3.72 3.73 Contiguous data loss 2.17 3.753.68

Table 12 shows that for single data point loss, TVE_(S) is all betterthan TVE_(O) and TVE_(L). The mean value of TVE_(S) is 0.0207%, the meanvalue of TVE_(O) is 0.051%, and the mean value of TVE_(L) is 3.998%.FIG. 23 indicates that for the multiple data point losses, the change ofTVE_(L) is the same as that of the ambient data. The mean value ofTVE_(L) is 10.31%. The mean value of TVE_(O) is below 0.1% and the meanvalue of TVE_(S) is below 0.05%. Table 13 shows that when there is onlysingle data loss, the running time of OLAP is 1.35 s. The running timeof the described method is 3.53 s. When there is multiple data loss, therunning time of OLAP becomes 3.75 s, while the described method is 3.68s. Because when there is only single data loss, the running time of thedescribed method is longer than OLAP. With the increase of data loss,the running time of OLAP method increases, and the running time of thedescribed method does not change much. Because OLAP method recovers onedata at a time, the number of missing data determines how many times themethod runs. When a large amount of data is lost, the number of runs ofOLAP method increases, which causes an increase in the running time. Thedescribed method relies on the iterative idea to recover the data withhigh precision. All the missing data can be recovered by one iteration.As the number of iterations increases, the recovered data is moreaccurate. The running time of the described method is related to thenumber of iterations and the amount of missing data has little effect onthe number of iterations. The running time of OLAP method increasesfaster than the described method in this disclosure as the lost dataratio increases. Therefore, the described disturbance data recoverymethod has the higher precision and effectiveness. It is preferredespecially when there is multiple missing data.

6.2.5. Impact of the Dataset Size on the Method

The running time of the four algorithms is compared by varying thedataset size N, which is shown in Table 14. In Table 14, the datasetsize N is from 100 to 500. The data loss ratio is 30%. It can be seenthat the running time of the Lagrange method and the described ambientdata recovery method increases linearly with the increase of thedataset. These two methods run faster because the algorithm is simple.The running time of OLAP method increases. Because the running numberincreases with the increase of the dataset. The running time of thedescribed disturbance data recovery method is less affected by thedataset. Because the running time mainly depends on the number ofiterations and the dataset has less impact on it. When the dataset sizeis from 100 to 400, the running time of OLAP method is shorter than thedescribed disturbance recovery method. When the dataset size N is 500,the running time of the described disturbance data recovery method isshorter than OLAP method. It is because that the number of runs of OLAPmethod increases as the dataset size increases, and the number ofiterations of the described disturbance data recovery method is almostconstant. Therefore, when the dataset size is large, the running time ofthe described disturbance data recovery method is shorter.

TABLE 14 Impact of the dataset size on the method Dataset size NTime_(L)/s Time_(O)/s Time_(C)/s Time_(S)/s 100 0.41 1.42 0.39 3.32 2000.63 2.15 0.65 3.75 300 1.24 2.85 1.28 3.89 400 1.76 3.66 1.84 4.16 5002.24 4.45 2.31 4.33

6.2.6. Impact of Data Loss Ratio on the Method

The recovery of the four methods are compared by changing the percentageof contiguous lost field data, which is shown in Table 15. According toTable 15, the accuracy of Lagrange method decreases quickly with theincreasing data loss. OLAP method cannot recover the missing dataprecisely when the ratio is higher than 30%. As for the describedambient data recovery method, when the loss ratio is less than 15%, themean TVE is 0.65%. As for the described disturbance data recoverymethod, if 40% disturbance data is lost, the mean TVE is still 0.86%.The described disturbance data recovery method can recover dataaccurately with a large amount of data loss.

TABLE 15 Impact of data loss on four methods Data loss ratio Time_(L)/sTime_(O)/s Time_(C)/s Time_(S)/s 10% 3.22 0.21 0.25 0.004 15% 4.82 0.540.65 0.004 20% 6.53 0.76 1.86 0.15 30% 8.17 3.24 5.37 0.32 40% 20.649.37 13.35 0.86 50% 45.14 23.43 29.85 16.57 60% 81.23 35.78 51.35 28.347. Conclusion

This disclosure described an adaptive PMU missing data recovery method.This technique can quickly and accurately recover different types ofmissing data of ambient and disturbance data, and guarantee the datafoundation for the subsequent application of PMU data in the system.

In particular, the adaptive PMU missing data recovery method can choosethe appropriate recovery method based on the types of data loss to meetthe speed and accuracy requirements. Additionally, the identificationmethod can accurately identify the simulation and field PMU data. Themethod can still perform well under the high data loss ratio. Moreover,for the ambient data, the improved cubic interpolation method based onpriority allocation strategy has high calculation accuracy and goodcomputational efficiency for large amounts of data. For the disturbancedata, the method based on singular value decomposition has good recoveryaccuracy and the calculation efficiency does not change much even ifthere is lots of missing data.

8. Technical Implementation of a Computer System

FIG. 24 illustrates exemplary hardware components of a computer system,which may be similar to a controller. A computer system 2400, or othercomputer systems similarly configured, may include and execute one ormore subsystem components to perform functions described herein,including the steps of various flow processes described above. Likewise,a mobile device, a cell phone, a smartphone, a laptop, a desktop, anotebook, a tablet, a wearable device, a server, etc., which includessome of the same components of the computer system 2400, may run anapplication (or software) and perform the steps and functionalitiesdescribed above. Computer system 2400 may connect to a network 2414,e.g., Internet, or other network, to receive inquiries, obtain data, andtransmit information and incentives as described above.

The computer system 2400 typically includes a memory 2402, a secondarystorage device 2404, and a processor 2406. The computer system 2400 mayalso include a plurality of processors 2406 and be configured as aplurality of, e.g., bladed servers, or other known serverconfigurations. The computer system 2400 may also include a networkconnection device 2408, a display device 2410, and an input device 2412.

The memory 2402 may include RAM or similar types of memory, and it maystore one or more applications for execution by processor 2406.Secondary storage device 2404 may include a hard disk drive, floppy diskdrive, CD-ROM drive, or other types of non-volatile data storage.Processor 2406 executes the application(s), such as those describedherein, which are stored in memory 2402 or secondary storage 2404, orreceived from the Internet or other network 2414. The processing byprocessor 2406 may be implemented in software, such as software modules,for execution by computers or other machines. These applicationspreferably include instructions executable to perform the system andsubsystem component functions and methods described above andillustrated in the FIGS. herein. The applications preferably providegraphical user interfaces (GUIs) through which users may view andinteract with subsystem components.

The computer system 2400 may store one or more database structures inthe secondary storage 2404, for example, for storing and maintaining theinformation necessary to perform the above-described functions.Alternatively, such information may be in storage devices separate fromthese components.

Also, as noted, processor 2406 may execute one or more softwareapplications to provide the functions described in this specification,specifically to execute and perform the steps and functions in theprocess flows described above. Such processes may be implemented insoftware, such as software modules, for execution by computers or othermachines. The GUIs may be formatted, for example, as web pages inHyperText Markup Language (HTML), Extensible Markup Language (XML) or inany other suitable form for presentation on a display device dependingupon applications used by users to interact with the computer system2400.

The input device 2412 may include any device for entering informationinto the computer system 2400, such as a touch-screen, keyboard, mouse,cursor-control device, microphone, digital camera, video recorder orcamcorder. The input and output device 2412 may be used to enterinformation into GUIs during performance of the methods described above.The display device 2410 may include any type of device for presentingvisual information such as, for example, a computer monitor orflat-screen display (or mobile device screen). The display device 2410may display the GUIs and/or output from sub-system components (orsoftware).

Examples of the computer system 2400 include dedicated server computers,such as bladed servers, personal computers, laptop computers, notebookcomputers, palm top computers, network computers, mobile devices, or anyprocessor-controlled device capable of executing a web browser or othertype of application for interacting with the system.

Although only one computer system 2400 is shown in detail, system 2400may use multiple computer systems or servers as necessary or desired tosupport the users and may also use back-up or redundant servers toprevent network downtime in the event of a failure of a particularserver. In addition, although computer system 2400 is depicted withvarious components, one skilled in the art will appreciate that thesystem can contain additional or different components. In addition,although aspects of an implementation consistent with the above aredescribed as being stored in a memory, one skilled in the art willappreciate that these aspects can also be stored on or read from othertypes of computer program products or computer-readable media, such assecondary storage devices, including hard disks, floppy disks, orCD-ROM; or other forms of RAM or ROM. The computer-readable media mayinclude instructions for controlling the computer system 2400, toperform a particular method, such as methods described above.

The present disclosure is not to be limited in terms of the particularembodiments described in this application, which are intended asillustrations of various aspects. Many modifications and variations canbe made without departing from its spirit and scope, as may be apparent.Functionally equivalent methods and apparatuses within the scope of thedisclosure, in addition to those enumerated herein, may be apparent fromthe foregoing representative descriptions. Such modifications andvariations are intended to fall within the scope of the appendedrepresentative claims. The present disclosure is to be limited only bythe terms of the appended representative claims, along with the fullscope of equivalents to which such representative claims are entitled.It is also to be understood that the terminology used herein is for thepurpose of describing particular embodiments only, and is not intendedto be limiting.

The invention claimed is:
 1. A method for improving a data quality of aphasor measurement unit including a processor and a memory, the methodcomprising: obtaining, using an interface of the phasor measurementunit, a plurality of datasets C_(i) including a plurality of data pointsfrom a power network; dividing, using the processor, each dataset C_(i)into ambient data or disturbance data using a binary classificationmodel stored on the memory and configured to perform the followingsteps: providing (S1) all the datasets C_(i) as inputs to a decisiontree; calculating (S2) gain ratios of features (a_(i), b_(i)), andselecting a largest dataset C_(i) for comparison with a threshold ε,wherein: the gain ratio is defined as follows:${g\left( {T,a,q_{i}} \right)} = \frac{G\left( {T,a,q_{i}} \right)}{I{V(a)}}$${I{V(a)}} = {- {\sum\limits_{\beta \in {\{{- {, +}}\}}}{\frac{T_{q_{i}}^{\beta}}{T}\log_{2}\frac{T_{q_{i}}^{\beta}}{T}}}}$where IV(a) is an intrinsic value; for each dataset C_(i), a_(i) is avariance, b_(i) is an extreme value difference and q_(i) is a divingpoint; and G(T, a, q_(i)) is information gain of q_(i) which is computedas follows:${G\left( {T,a,q_{i}} \right)} = {{H(T)} - {\frac{T_{q_{i}}^{-}}{T}{H\left( T_{q_{i}}^{-} \right)}\frac{T_{q_{i}}^{+}}{T}{H\left( T_{q_{i}}^{+} \right)}}}$where |T| represents the number of groups, |T_(q) _(i) ⁻|/|T| is aweight of the groups in which a_(i)≤q_(i), and |T_(q) _(i) ⁺|/|T|represents a weight of the groups where a_(i)>q_(i); if the gain ratiois less than the threshold ε, the decision tree is a single node treeand a state of all the datasets is the state of a majority; and if thegain ratio is not less than the threshold ε, g(T, a, q_(a)) is thelargest dataset C_(i); comparing (S3) a_(i) with q_(a), and placingdatasets C_(i) with a_(i)≤q_(a) in a class for ambient data, and placingremaining datasets C_(i) in a class for disturbance data; and repeatingsteps S1-S3 recursively until all datasets in each class are in the samestate, or a maximum gain ratio is less than the threshold ε; restoringmissing data from the class of ambient data by: determining a number Eand a type of data points missing, wherein the number E represents anumber of missing data points; implementing a priority allocationstrategy based on the number E and the type of data points missing;determining an improved cubic spline interpolation function thatsatisfies an extreme value difference constraint and a missing data,wherein S(t) is the cubic spline interpolation function of ƒ(t) on nodet₀, t₁, . . . , x_(n) as written below:${S(t)} = {{M_{i}\frac{\left( {t_{i + 1} - t} \right)^{3}}{6h_{i}}} + {M_{i + 1}\frac{\left( {t - t_{i}} \right)^{3}}{6h_{i}}} + {\left( {\frac{X_{i}}{h_{i}} - {\frac{M_{i}}{6}h_{i}}}\  \right)\left( {t_{i + 1} - t} \right)} + {\left( {\frac{X_{i + 1}}{h_{i}} - {\frac{M_{i + 1}}{6}h_{i}}} \right)\left( {t - t_{i}} \right)}}$where M_(i)=S″(t_(i)), wherein: on each subinterval [t_(k), t_(k+1)](k=1, 2, . . . , n−1), S(t) is a polynomial no more than three times,and s(t_(i))=X_(i); in interval [t_(i), t_(i+1)], the second derivativeof S(t) is expressed as:${S^{''}(t)} = {{M_{i}\frac{t_{i + 1} - t}{h_{i}}} + {M_{i + 1}\frac{t - t_{i}}{h_{i}}}}$where h_(i)=t_(i+1)−t_(i); and two consecutive integrations of S″(t)provide:${{S^{\prime}(t)} = {{{- M_{i}}\frac{\left( {t_{i + 1} - t} \right)^{2}}{2h_{i}}} + {M_{i + 1}\frac{\left( {t - t_{i}} \right)^{2}}{2h_{i}}} + \frac{X_{i + 1} - X_{i}}{h_{i}} - {\frac{M_{i + 1} - M_{i}}{6}h_{i}}}};$ and restoring the missing data using the cubic spline interpolationfunction and known data; restoring missing data from the class ofdisturbance data by: determining an amount of data N and a number ofrows of a track matrix L according to an actual data loss position;constructing a trajectory matrix and performing singular valuedecomposition to extract all characteristic components by mapping aone-dimensional time series X to a trajectory matrix Y of ahigh-dimensional space as follows: $\begin{matrix}{X = {\left. \left\lbrack {X_{t_{1}},X_{t_{2}},\ldots\mspace{14mu},X_{t_{N}}} \right\rbrack\rightarrow Y \right. = \left\lbrack {Y_{1},Y_{2},{\ldots\mspace{14mu} Y_{K}}} \right\rbrack}} & \; \\{{Y_{z} = \left\lbrack {X_{z},X_{z + 1},\ldots\mspace{14mu},X_{z + L - 1}} \right\rbrack^{T}},{z = 1},2,\ldots\mspace{14mu},K} & \; \\{Y = \begin{bmatrix}X_{t_{1}} & X_{t_{2}} & \ldots & X_{t_{m}} & \ldots & X_{t_{n}} & \ldots & X_{t_{K}} \\X_{t_{2}} & X_{t_{3}} & \ldots & X_{t_{m + 1}} & \ldots & X_{t_{n + 1}} & \ldots & X_{t_{K + 1}} \\\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\X_{t_{L}} & X_{t_{L + 1}} & \ldots & X_{t_{m - 1 + L}} & \ldots & X_{t_{n - 1 + L}} & \ldots & X_{t_{N}}\end{bmatrix}} & \;\end{matrix}$ where the matrix Y is a Hankel matrix which has equalelements Y_(ij) on anti-diagonals i+j=const; subscript L is a number ofmatrix rows, which measure the dimension of the matrix Y, and K is anumber of matrix columns, K=N−L+1; reconstructing an original sequenceusing an optimal characteristic component, wherein the original sequenceX decomposition is reconstructed into a sum X(1) of k sequences [X⁽¹⁾,X⁽²⁾, . . . , X^((k))], where: $\begin{matrix}{X^{(k)} = \left\lbrack {X_{t_{1}}^{(k)},X_{t_{2}}^{(k)},\ldots\mspace{14mu},X_{t_{N - 1}}^{(k)},X_{t_{N}}^{(k)}} \right\rbrack} & \; \\\left. {X(1)}\leftrightarrow\begin{bmatrix}X_{t_{1}}^{(1)} & X_{t_{2}}^{(1)} & \ldots & X_{t_{N - 1}}^{(1)} & X_{t_{N}}^{(1)} \\X_{t_{1}}^{(2)} & X_{t_{2}}^{(2)} & \ldots & X_{t_{N - 1}}^{(2)} & X_{t_{N}}^{(2)} \\\vdots & \vdots & \vdots & \vdots & \vdots \\X_{t_{1}}^{(k)} & X_{t_{2}}^{(k)} & \ldots & X_{t_{N - 1}}^{(k)} & X_{t_{N}}^{(k)}\end{bmatrix} \right. & \;\end{matrix}$ replacing a missing data X_(lost) by a correspondingposition value of the reconstructed sequence X(1); providing the missingdata to the phasor measurement unit to support power system dynamicmonitoring, power system state estimation, parameter identification,and/or closed-loop control.